Abstract

We consider the Monge-Kantorovich transport problem in a purely measure the- 
oretic setting, i.e. without imposing continuity assumptions on the cost function. 
It is known that transport plans which are concentrated on c-monotone sets are 
optimal, provided the cost function c is either lower semi-continuous and finite, or 
continuous and may possibly attain the value $\infty$. We show that this is true in a more
general setting, in particular for merely Borel measurable cost functions provided
that $\{c = \infty\}$ is the union of a closed set and a negligible set. In a previous paper
Schachermayer and Teichmann considered strongly c-monotone transport plans and 
proved that every strongly c-monotone transport plan is optimal. We establish that 
transport plans are strongly c-monotone if and only if they satisfy a "better" notion 
of optimality called robust optimality. 

Key words: Monge-Kantorovich problem, c-cyclically monotone, strongly 
c-monotone, measurable cost function 



1 Introduction 

We consider the Monge-Kantorovich transport problem $(\mu, \nu, c)$ for Borel
probability measures $\mu, \nu$ on Polish spaces X, Y and a Borel measurable cost
function $c : X \times Y \to [0, \infty]$. As standard references on the theory of
mass transport we mention [1, 9, 14, 15]. By $\Pi(\mu, \nu)$ we denote the set of all
probability measures on $X \times Y$ with X-marginal $\mu$ and Y-marginal $\nu$. For a
Borel measurable cost function $c : X \times Y \to [0, \infty]$ the transport costs of
a given transport plan $\pi \in \Pi(\mu, \nu)$ are defined by

\[ I_c[\pi] := \int_{X \times Y} c(x, y) \, d\pi. \tag{1} \]

$\pi$ is called a finite transport plan if $I_c[\pi] < \infty$.
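
For finitely supported measures the integral in (1) reduces to a finite sum. The following is a minimal sketch of that special case, not part of the paper's argument; the dictionary representation of a plan and the toy data are assumptions made for illustration only.

```python
def marginals(plan):
    """Return the X- and Y-marginals of a discrete plan {(x, y): mass}."""
    mu, nu = {}, {}
    for (x, y), mass in plan.items():
        mu[x] = mu.get(x, 0.0) + mass
        nu[y] = nu.get(y, 0.0) + mass
    return mu, nu


def transport_cost(plan, cost):
    """Discrete analogue of I_c[pi]: sum of c(x, y) * pi({(x, y)}).

    Pairs without mass are skipped so that an infinite cost on an
    unused route does not produce inf * 0.
    """
    return sum(cost(x, y) * mass for (x, y), mass in plan.items() if mass > 0)


def quadratic(x, y):
    """Squared distance, used here as a sample cost function."""
    return (x - y) ** 2


# Hypothetical toy data: sources at 0 and 1, targets at 1 and 2.
shift_plan = {(0, 1): 0.5, (1, 2): 0.5}  # shift every point by 1
cross_plan = {(0, 2): 0.5, (1, 1): 0.5}  # same marginals, different routing
```

Both toy plans have the same marginals, so they compete in the same transport problem, and `transport_cost` shows that they have different costs.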

A nice interpretation of the Monge-Kantorovich transport problem is given by 
Cedric Villani in Chapter 3 of the impressive monograph [13] : 

"Consider a large number of bakeries, producing breads, that should be trans- 
ported each morning to cafes where consumers will eat them. The amount of 
bread that can be produced at each bakery, and the amount that will be con- 
sumed at each cafe are known in advance, and can be modeled as probability 
measures (there is a "density of production" and a "density of consumption") 
on a certain space, which in our case would be Paris (equipped with the natural 
metric such that the distance between two points is the length of the shortest 
path joining them). The problem is to find in practice where each unit of bread 
should go, in such a way as to minimize the total transport cost." 

We are interested in optimal transport plans, i.e. minimizers of the functional
$I_c[\cdot]$ and their characterization via the notion of c-monotonicity.

Definition 1.1 A Borel set $\Gamma \subseteq X \times Y$ is called c-monotone if

\[ \sum_{i=1}^{n} c(x_i, y_i) \le \sum_{i=1}^{n} c(x_i, y_{i+1}) \tag{2} \]

for all pairs $(x_1, y_1), \dots, (x_n, y_n) \in \Gamma$ using the convention
$y_{n+1} := y_1$. A transport plan $\pi$ is called c-monotone if there exists a
c-monotone $\Gamma$ with $\pi(\Gamma) = 1$.
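
For a finite set of pairs, condition (2) can be tested by brute force. The sketch below is an illustration under that finiteness assumption and is not taken from the paper. It only inspects cycles of distinct pairs, which suffices because any finite chain with repetitions splits into such cycles; the function name is hypothetical.

```python
from itertools import permutations


def is_c_monotone(pairs, cost):
    """Check inequality (2) on every cycle of distinct pairs.

    Brute force with factorial running time, intended for tiny examples only.
    """
    pairs = list(dict.fromkeys(pairs))  # drop duplicates, keep order
    for n in range(2, len(pairs) + 1):
        for cycle in permutations(pairs, n):
            lhs = sum(cost(x, y) for x, y in cycle)
            # convention y_{n+1} := y_1 is realised by the index modulo n
            rhs = sum(cost(cycle[i][0], cycle[(i + 1) % n][1]) for i in range(n))
            if lhs > rhs:
                return False
    return True


def quadratic(x, y):
    """Squared distance, used here as a sample cost function."""
    return (x - y) ** 2
```

With the squared distance, the set {(0, 1), (1, 2)} passes the test, while {(0, 2), (1, 1)} fails it, since swapping the two targets lowers the total cost.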

In the literature (e.g. [Tir5]fTf5]fTH] ) the following characterization was estab- 
lished under various continuity assumptions on the cost function. Our main 
result states that those assumptions are not required. 

Theorem 1 Let X, Y be Polish spaces equipped with Borel probability measures
$\mu, \nu$ and let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost function.

a. Every finite optimal transport plan is c-monotone.

b. Every finite c-monotone transport plan is optimal if there exist a closed
set F and a $\mu \otimes \nu$-null set N such that
$\{(x, y) : c(x, y) = \infty\} = F \cup N$.

Thus in the case of a cost function which does not attain the value $\infty$ the
equivalence of optimality and c-monotonicity is valid without any restrictions
beyond the obvious measurability conditions inherent in the formulation of
the problem.

The subsequent construction due to Ambrosio and Pratelli in [1, Example 3.5]
shows that if c is allowed to attain $\infty$ the implication "c-monotone
$\Rightarrow$ optimal" does not hold without some additional assumption as in
Theorem 1.b.

Example 1.2 (Ambrosio and Pratelli) Let X = Y = [0,1], equipped with
Lebesgue measure $\lambda = \mu = \nu$. Pick $\alpha \in [0, 1)$ irrational. Set

\[ \Gamma_0 = \{(x, x) : x \in X\}, \qquad \Gamma_1 = \{(x, x \oplus \alpha) : x \in X\}, \]

where $\oplus$ is addition modulo 1. Let $c : X \times Y \to [0, \infty]$ be such that
$c = a \in [0, \infty)$ on $\Gamma_0$, $c = b \in [0, \infty)$ on $\Gamma_1$ and
$c = \infty$ otherwise. It is then easy to check that $\Gamma_0$ and $\Gamma_1$ are
c-monotone sets. Using the maps $f_0, f_1 : X \to X \times Y$, $f_0(x) = (x, x)$,
$f_1(x) = (x, x \oplus \alpha)$ one defines the transport plans
$\pi_0 = f_{0\#}\lambda$, $\pi_1 = f_{1\#}\lambda$ supported by $\Gamma_0$ respectively
$\Gamma_1$. Then $\pi_0$ and $\pi_1$ are finite c-monotone transport plans, but as
$I_c[\pi_0] = a$, $I_c[\pi_1] = b$ it depends on the choice of a and b which
transport plan is optimal. Note that in contrast to the assumption in
Theorem 1.b the set $\{(x, y) \in X \times Y : c = \infty\}$ is open.

We want to remark that rather trivial (folklore) examples show that an
optimal transport plan need not exist if the cost function does not satisfy
proper continuity assumptions.

Example 1.3 Consider the task to transport points on the real line (equipped
with the Lebesgue measure) from the interval [0,1) to [1,2) where the cost
of moving one point to another is the squared distance between these points
($X = [0,1)$, $Y = [1,2)$, $c(x, y) = (x - y)^2$, $\mu = \nu = \lambda$). The simplest
way to achieve this transport is to shift every point by 1. This results in
transport costs of 1 and one easily checks that all other transport plans are
more expensive.

If we now alter the cost function to be 2 whenever two points have distance 1,
i.e. if we set

\[ \tilde c(x, y) = \begin{cases} 2 & \text{if } y = x + 1, \\ (x - y)^2 & \text{otherwise,} \end{cases} \]

it becomes impossible to find a transport plan $\pi \in \Pi(\mu, \nu)$ with total
transport costs $I_{\tilde c}[\pi] = 1$, but it is still possible to achieve transport
costs arbitrarily close to 1. (For instance, shift $[0, 1 - \varepsilon)$ to
$[1 + \varepsilon, 2)$ and $[1 - \varepsilon, 1)$ to $[1, 1 + \varepsilon)$ for small
$\varepsilon > 0$.)
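
The approximating plans at the end of Example 1.3 can be costed by hand: each piece is moved by a constant displacement, so its cost is its mass times the cost of that displacement. The sketch below carries out this arithmetic for the altered cost function. It is an illustration only, and the function names are assumptions.

```python
def altered_cost(x, y):
    """Cost of Example 1.3: squared distance, except 2 at distance exactly 1."""
    return 2.0 if y - x == 1 else (x - y) ** 2


def epsilon_plan_cost(eps):
    """Total cost of shifting [0, 1-eps) by 1+eps and [1-eps, 1) by eps.

    Each piece has constant displacement, so one representative point per
    piece is enough to evaluate the cost.
    """
    assert 0 < eps < 1
    long_part = (1 - eps) * altered_cost(0.0, 1.0 + eps)  # mass 1-eps
    short_part = eps * altered_cost(1.0 - eps, 1.0)        # mass eps
    return long_part + short_part
```

Expanding the two terms gives a total cost of 1 + eps - eps^2, which exceeds 1 for every eps in (0, 1) and tends to 1 as eps tends to 0, whereas the plain shift by 1 costs 2 under the altered cost.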
1.1 History of the problem



The notion of c-monotonicity originates in convex analysis. The well known
Rockafellar Theorem (see for instance [11, Theorem 3] or [14, Theorem 2.27])
and its generalization, Rüschendorf's Theorem (see [12, Lemma 2.1]),
characterize c-monotonicity in $\mathbb{R}^n$ in terms of integrability. The
definitions of c-concave functions and super-differentials can be found for
instance in [14, Section 2.4].

Theorem (Rockafellar) A non-empty set $\Gamma \subseteq \mathbb{R}^n \times \mathbb{R}^n$
is cyclically monotone (that is, c-monotone with respect to the squared
euclidean distance) if and only if there exists a l.s.c. concave function
$\varphi : \mathbb{R}^n \to \mathbb{R}$ such that $\Gamma$ is contained in the
super-differential $\partial(\varphi)$.

Theorem (Rüschendorf) Let X, Y be abstract spaces and
$c : X \times Y \to [0, \infty]$ arbitrary. Let $\Gamma \subseteq X \times Y$ be
c-monotone. Then there exists a c-concave function $\varphi : X \to Y$ such that
$\Gamma$ is contained in the c-super-differential $\partial^c(\varphi)$.

Important results of Gangbo and McCann [3] and Brenier [14, Theorem 2.12]
use these potentials to establish uniqueness of the solutions of the
Monge-Kantorovich transport problem in $\mathbb{R}^n$ for different types of cost
functions subject to certain regularity conditions.

Optimality implies c-monotonicity: This is evident in the discrete case if
X and Y are finite sets. For suppose that $\pi$ is a transport plan for which
c-monotonicity is violated on pairs $(x_1, y_1), \dots, (x_n, y_n)$ where all points
$x_1, \dots, x_n$ and $y_1, \dots, y_n$ carry positive mass. Then we can reduce costs
by sending the mass $\alpha > 0$, for $\alpha$ sufficiently small, from $x_i$ to
$y_{i+1}$ instead of $y_i$, that is, we replace the original transport plan $\pi$
with

\[ \pi_\beta = \pi + \alpha \sum_{i=1}^{n} \delta_{(x_i, y_{i+1})} - \alpha \sum_{i=1}^{n} \delta_{(x_i, y_i)}. \tag{3} \]

(Here we are using the convention $y_{n+1} = y_1$.)
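
The rerouting step (3) is easy to carry out for a finitely supported plan. The sketch below is an illustration of that discrete case, not the paper's general construction; the plan representation and helper names are assumptions.

```python
def reroute(plan, cycle, alpha):
    """Apply (3): move mass alpha from (x_i, y_i) to (x_i, y_{i+1}) along a cycle."""
    n = len(cycle)
    new_plan = dict(plan)
    for i, (x, y) in enumerate(cycle):
        y_next = cycle[(i + 1) % n][1]  # convention y_{n+1} = y_1
        new_plan[(x, y)] = new_plan.get((x, y), 0.0) - alpha
        new_plan[(x, y_next)] = new_plan.get((x, y_next), 0.0) + alpha
    if min(new_plan.values()) < 0:
        raise ValueError("alpha too large: some pair would get negative mass")
    return new_plan


def marginals(plan):
    """X- and Y-marginals of a discrete plan, ignoring pairs without mass."""
    mu, nu = {}, {}
    for (x, y), mass in plan.items():
        if mass > 0:
            mu[x] = mu.get(x, 0.0) + mass
            nu[y] = nu.get(y, 0.0) + mass
    return mu, nu


def transport_cost(plan, cost):
    """Discrete transport cost: sum of c(x, y) times the mass at (x, y)."""
    return sum(cost(x, y) * m for (x, y), m in plan.items() if m > 0)


def quadratic(x, y):
    """Squared distance, used here as a sample cost function."""
    return (x - y) ** 2


# Hypothetical plan violating c-monotonicity on the cycle (0, 2), (1, 1).
plan = {(0, 2): 0.5, (1, 1): 0.5}
better = reroute(plan, [(0, 2), (1, 1)], alpha=0.25)
```

In this toy case the marginals are unchanged and the cost drops from 2 to 1.5, which is the effect described above.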

Gangbo and McCann ([3, Theorem 2.3]) show how continuity assumptions on
the cost function can be exploited to extend this to an abstract setting. Hence
one achieves:

Let X and Y be Polish spaces equipped with Borel probability measures
$\mu, \nu$. Let $c : X \times Y \to [0, \infty]$ be a l.s.c. cost function. Then every
finite optimal transport plan is c-monotone.

Using measure theoretic tools, as developed in the beautiful paper by Kellerer
[6], we are able to extend this to Borel measurable cost functions (Theorem
1.a) without any additional regularity assumption.



c-monotonicity implies optimality: In the case of finite spaces X, Y this again
is nothing more than an easy exercise ([14, Exercise 2.21]). The problem gets
harder in the infinite setting. It was first proved in [3] that for X, Y compact
subsets of $\mathbb{R}^n$ and c a continuous cost function, c-monotonicity implies
optimality. In a more general setting this was shown in [1, Theorem 3.2] for
l.s.c. cost functions which additionally satisfy the moment conditions

\[ \mu\Big(\Big\{x : \int_Y c(x, y) \, d\nu(y) < \infty\Big\}\Big) > 0, \qquad
   \nu\Big(\Big\{y : \int_X c(x, y) \, d\mu(x) < \infty\Big\}\Big) > 0. \]

Further research into this direction was initiated by the following problem
posed by Villani in [14, Problem 2.25]:

For $X = Y = \mathbb{R}^n$ and $c(x, y) = \|x - y\|^2$, the squared euclidean distance,
does c-monotonicity of a transport plan imply its optimality?

A positive answer to this question was given independently by Pratelli in [8]
and by Schachermayer and Teichmann in [13]. Pratelli proves the result for
countable spaces and shows that it extends to the Polish case by means of
approximation if the cost function $c : X \times Y \to [0, \infty]$ is continuous. The
paper [13] pursues a different approach: The notion of strong c-monotonicity
is introduced. From this property optimality follows fairly easily and the main
part of the paper is concerned with the fact that strong c-monotonicity follows
from the usual notion of c-monotonicity in the Polish setting if c is assumed
to be l.s.c. and finitely valued.

Part (b) of Theorem 1 unifies these statements: Pratelli's result follows from
the fact that for continuous $c : X \times Y \to [0, \infty]$ the set
$\{c = \infty\} = c^{-1}[\{\infty\}]$ is closed; the Schachermayer-Teichmann result
follows since for finite c the set $\{c = \infty\}$ is empty.

Similar to [13] our proofs are based on the concept of strong c-monotonicity.
In Section 1.2 we present robust optimality which is a variant of optimality
that we shall show to be equivalent to strong c-monotonicity. As not every
optimal transport plan is also robustly optimal, this accounts for the somewhat
provocative concept of "better than optimal" transport plans alluded to in the
title of this paper.

Correspondingly the notion of strong c-monotonicity is in fact stronger than
ordinary c-monotonicity (at least if c is allowed to assume the value $\infty$).
1.2 Strong Notions



It turns out that optimality of a transport plan is intimately connected with
the notion of strong c-monotonicity introduced in [13].

Definition 1.4 A Borel set $\Gamma \subseteq X \times Y$ is strongly c-monotone if
there exist Borel measurable functions $\varphi : X \to [-\infty, \infty)$ and
$\psi : Y \to [-\infty, \infty)$ such that $\varphi(x) + \psi(y) \le c(x, y)$ for all
$(x, y) \in X \times Y$ and $\varphi(x) + \psi(y) = c(x, y)$ for all
$(x, y) \in \Gamma$. A transport plan $\pi \in \Pi(\mu, \nu)$ is strongly c-monotone
if $\pi$ is concentrated on a strongly c-monotone Borel set $\Gamma$.

Strong c-monotonicity implies c-monotonicity since

\[ \sum_{i=1}^{n} c(x_{i+1}, y_i) \ge \sum_{i=1}^{n} \varphi(x_{i+1}) + \psi(y_i)
   = \sum_{i=1}^{n} \varphi(x_i) + \psi(y_i) = \sum_{i=1}^{n} c(x_i, y_i) \tag{4} \]

whenever $(x_1, y_1), \dots, (x_n, y_n) \in \Gamma$.
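
On finite spaces the two conditions of Definition 1.4 can be checked directly for a given pair of candidate functions. The sketch below does this; it is an illustration only, and the function name, the tolerance parameter and the sample potentials are assumptions. A negative answer only says that these particular candidates fail, not that no witnesses exist.

```python
def witnesses_strong_c_monotonicity(pairs, xs, ys, cost, phi, psi, tol=1e-12):
    """Check phi(x) + psi(y) <= c(x, y) everywhere and equality on the given pairs."""
    for x in xs:
        for y in ys:
            if phi[x] + psi[y] > cost(x, y) + tol:
                return False
    for x, y in pairs:
        if abs(phi[x] + psi[y] - cost(x, y)) > tol:
            return False
    return True


def quadratic(x, y):
    """Squared distance, used here as a sample cost function."""
    return (x - y) ** 2


# Hypothetical finite example: X = {0, 1}, Y = {1, 2}.
xs, ys = [0, 1], [1, 2]
phi = {0: 0.0, 1: -2.0}
psi = {1: 1.0, 2: 3.0}
```

These potentials witness strong c-monotonicity of {(0, 1), (1, 2)} for the squared distance, and they fail on {(0, 2), (1, 1)}, where equality does not hold.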

If there are integrable functions $\varphi$ and $\psi$ witnessing that $\pi$ is
strongly c-monotone, then for every $\tilde\pi \in \Pi(\mu, \nu)$ we can estimate:

\[ I_c[\pi] = \int_\Gamma c(x, y) \, d\pi = \int_\Gamma [\varphi(x) + \psi(y)] \, d\pi
   = \int_X \varphi(x) \, d\mu + \int_Y \psi(y) \, d\nu
   = \int_{X \times Y} [\varphi(x) + \psi(y)] \, d\tilde\pi \le I_c[\tilde\pi]. \]

Thus in this case strong c-monotonicity implies optimality. However there is
no reason why the Borel measurable functions $\varphi, \psi$ appearing in Definition
1.4 should be integrable. In [13, Proposition 2.1] it is shown that for l.s.c.
cost functions, there is a way of truncating which also allows one to handle
non-integrable functions $\varphi$ and $\psi$. The proof extends to merely Borel
measurable functions; hence we have:

Proposition 1.5 Let X, Y be Polish spaces equipped with Borel probability
measures $\mu, \nu$ and let $c : X \times Y \to [0, \infty]$ be Borel measurable. Then
every finite transport plan which is strongly c-monotone is optimal.

No new ideas are required to extend [13, Proposition 2.1] to the present setting
but since Proposition 1.5 is a crucial ingredient of several proofs in this paper
we provide an outline of the argument in Section 3.

As it will turn out, strongly c-monotone transport plans even satisfy a "better" 
notion of optimality, called robust optimality. 

Definition 1.6 Let X, Y be Polish spaces equipped with Borel probability
measures $\mu, \nu$ and let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost
function. A transport plan $\pi \in \Pi(\mu, \nu)$ is robustly optimal if, for any
Polish space Z and any finite Borel measure $\lambda \ge 0$ on Z, there exists a
Borel measurable extension $\tilde c : (X \cup Z) \times (Y \cup Z) \to [0, \infty]$
satisfying

\[ \tilde c(a, b) = \begin{cases} c(a, b) & \text{for } a \in X, b \in Y, \\
   0 & \text{for } a, b \in Z, \\ < \infty & \text{otherwise,} \end{cases} \]

such that the measure $\tilde\pi := \pi + (\mathrm{id}_Z \times \mathrm{id}_Z)_\# \lambda$
is optimal on $(X \cup Z) \times (Y \cup Z)$. Note that $\tilde\pi$ is not a probability
measure, but has total mass $1 + \lambda(Z) \in [1, \infty)$.

Note that since we allow the possibility $\lambda(Z) = 0$ every robustly optimal
transport plan is in particular optimal in the usual sense.

Robust optimality has a colorful "economic" interpretation: a tycoon wants
to enter the Parisian croissant consortium. She builds a storage of size
$\lambda(Z)$ where she buys up croissants and sends them to the cafes. Her hope is
that by offering low transport costs, the previously optimal transport plan
$\pi$ will not be optimal anymore, so that the traditional relations between
bakeries and cafes will collapse. Of course, the authorities of Paris will try to
defend their structure by imposing (possibly very high, but still finite) tolls
for all transports to and from the tycoon's storage, thus resulting in finite
costs $\tilde c(a, b)$ for $(a, b) \in (X \times Z) \cup (Z \times Y)$. In the case of
robustly optimal $\pi$ they can successfully defend themselves against the
intruder.

Every robustly optimal transport plan $\pi$ is optimal in the usual sense and
hence also c-monotone. The crucial feature is that robust optimality implies
strong c-monotonicity. In fact, the two properties are equivalent.

Theorem 2 Let X, Y be Polish spaces equipped with Borel probability measures
$\mu, \nu$ and $c : X \times Y \to [0, \infty]$ a Borel measurable cost function. For a
finite transport plan $\pi$ the following assertions are equivalent:

a. $\pi$ is strongly c-monotone.

b. $\pi$ is robustly optimal.

Example 5.1 below shows that robust optimality resp. strong c-monotonicity
is in fact a stronger property than usual optimality.



1.3 Putting things together 

Finally we want to point out that in the situation where c is finite all previously
mentioned notions of monotonicity and optimality coincide. We can even pass
to a slightly more general setting than finite cost functions and obtain the
following result.

Theorem 3 Let X, Y be Polish spaces equipped with Borel probability measures
$\mu, \nu$ and let $c : X \times Y \to [0, \infty]$ be Borel measurable and
$\mu \otimes \nu$-a.e. finite. For a finite transport plan $\pi$ the following
assertions are equivalent:

(1) $\pi$ is optimal.

(2) $\pi$ is c-monotone.

(3) $\pi$ is robustly optimal.

(4) $\pi$ is strongly c-monotone.

The equivalence of (1), (2) and (4) was established in [13] under the additional
assumption that c is l.s.c. and finitely valued.



We sum up the situation under fully general assumptions. The upper line (1
and 2) relates to the optimality of a transport plan $\pi$. The lower line (3 and
4) contains the two equivalent strong concepts and implies the upper line but,
without additional assumptions, not vice versa.

[Figure 1: diagram relating (1) optimal, (2) c-monotone, (3) robustly optimal
and (4) strongly c-monotone, with implications labelled Thm. 1, Thm. 2 and
Thm. 3.]

Fig. 1. Implications between properties of transport plans

Note that the implications symbolized by dotted lines in Figure 1 are not true
without additional assumptions ((2) $\Rightarrow$ (1): Example 1.2; (1)
$\Rightarrow$ (3) resp. (4): Example 5.1).

The paper is organized as follows: In Section 2 we prove that every optimal
transport plan $\pi$ is c-monotone (Theorem 1.a). In Section 3 we introduce an
auxiliary property (connectedness) of the support of a transport plan and
show that it allows to pass from c-monotonicity to strong c-monotonicity.
Moreover we establish that strong c-monotonicity implies optimality
(Proposition 1.5). Section 4 is concerned with the proof of Theorem 1.b. Finally
we complete the proofs of Theorems 2 and 3 in Section 5.

We observe that in all the above discussion we only referred to the Borel
structure of the Polish spaces X, Y, and never referred to the topological
structure. Hence the above results (with the exception of Theorem 1.b) hold
true for standard Borel measure spaces.



In fact it seems likely that our results can be transferred to the setting of 
perfect measure spaces. (See [TD] for a general overview resp. [H] for a treatment 
of problems of mass transport in this framework.) However we do not pursue 
this direction. 

Acknowledgement. The authors are indebted to the extremely careful ref- 
eree who noticed many inaccuracies resp. mistakes and whose insightful sug- 
gestions led to a more accessible presentation of several results in this paper. 



2 Improving Transports 

Assume that some transport plan $\pi \in \Pi(\mu, \nu)$ is given. From a purely
heuristic point of view there are either few tuples
$((x_1, y_1), \dots, (x_n, y_n))$ along which c-monotonicity is violated, or there are
many such tuples, in which case $\pi$ can be enhanced by rerouting the transport
along these tuples. As the notion of c-monotonicity refers to n-tuples it turns
out that it is necessary to consider finitely many measure spaces to properly
formulate what is meant by "few" resp. "many".

Let $X_1, \dots, X_n$ be Polish spaces equipped with finite Borel measures
$\mu_1, \dots, \mu_n$. By $\Pi(\mu_1, \dots, \mu_n) \subseteq \mathcal{M}(X_1 \times \dots \times X_n)$
we denote the set of all Borel measures on $X_1 \times \dots \times X_n$ such that the
i-th marginal measure coincides with the Borel measure $\mu_i$ for
$i = 1, \dots, n$. By $p_{X_i} : X_1 \times \dots \times X_n \to X_i$ we denote the
projection onto the i-th component. $B \subseteq X_1 \times \dots \times X_n$ is called an
L-shaped null set if there exist null sets $N_1 \subseteq X_1, \dots, N_n \subseteq X_n$
such that $B \subseteq \bigcup_{i=1}^{n} p_{X_i}^{-1}[N_i]$.

The Borel sets of $X_1 \times \dots \times X_n$ satisfy a nice dichotomy. They are either
L-shaped null sets or they carry a positive measure whose marginals are
absolutely continuous with respect to $\mu_1, \dots, \mu_n$:

Proposition 2.1 Let $X_1, \dots, X_n$, $n \ge 2$ be Polish spaces equipped with Borel
probability measures $\mu_1, \dots, \mu_n$. Then for any Borel set
$B \subseteq X_1 \times \dots \times X_n$ let

\[ P(B) := \sup \{ \pi(B) : \pi \in \Pi(\mu_1, \dots, \mu_n) \}, \tag{5} \]

\[ L(B) := \inf \Big\{ \sum_{i=1}^{n} \mu_i(B_i) : B_i \subseteq X_i \text{ and }
   B \subseteq \bigcup_{i=1}^{n} p_{X_i}^{-1}[B_i] \Big\}. \tag{6} \]

Then $P(B) \ge \frac{1}{n} L(B)$. In particular B satisfies one of the following
alternatives:

a. B is an L-shaped null set.

b. There exists $\pi \in \Pi(\mu_1, \dots, \mu_n)$ such that $\pi(B) > 0$.

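The main ingredient in the proof of Proposition 2.1 is the following duality
theorem due to Kellerer (see [6, Lemma 1.8(a), Corollary 2.18]).

Theorem (Kellerer) Let $X_1, \dots, X_n$, $n \ge 2$ be Polish spaces equipped with
Borel probability measures $\mu_1, \dots, \mu_n$ and assume that
$c : X = X_1 \times \dots \times X_n \to \mathbb{R}$ is Borel measurable and that
$\overline{c} := \sup_X c$, $\underline{c} := \inf_X c$ are finite. Set

\[ I(c) = \inf \Big\{ \int_X c \, d\pi : \pi \in \Pi(\mu_1, \dots, \mu_n) \Big\}, \]

\[ S(c) = \sup \Big\{ \sum_{i=1}^{n} \int_{X_i} \varphi_i \, d\mu_i :
   c(x_1, \dots, x_n) \ge \sum_{i=1}^{n} \varphi_i(x_i), \
   \tfrac{1}{n} \overline{c} - (\overline{c} - \underline{c}) \le \varphi_i \le \tfrac{1}{n} \overline{c} \Big\}. \]

Then $I(c) = S(c)$.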



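PROOF of Proposition 2.1. Observe that $-I(-\mathbb{1}_B) = P(B)$ and that

\[ -S(-\mathbb{1}_B) = \inf \Big\{ \sum_{i=1}^{n} \int_{X_i} \chi_i \, d\mu_i :
   \mathbb{1}_B(x_1, \dots, x_n) \le \sum_{i=1}^{n} \chi_i(x_i), \ 0 \le \chi_i \le 1 \Big\}. \tag{7} \]

By Kellerer's Theorem $-S(-\mathbb{1}_B) = -I(-\mathbb{1}_B)$. Thus it remains to show
that $-S(-\mathbb{1}_B) \ge \frac{1}{n} L(B)$. Fix functions $\chi_1, \dots, \chi_n$ as in
(7). Then for each $(x_1, \dots, x_n) \in B$ one has
$1 = \mathbb{1}_B(x_1, \dots, x_n) \le \sum_{i=1}^{n} \chi_i(x_i)$ and hence there exists
some i such that $\chi_i(x_i) \ge 1/n$. Thus
$B \subseteq \bigcup_{i=1}^{n} p_{X_i}^{-1}[\{\chi_i \ge 1/n\}]$. It follows that

\[ \begin{aligned}
-S(-\mathbb{1}_B) &\ge \inf \Big\{ \sum_{i=1}^{n} \int_{X_i} \chi_i \, d\mu_i :
   B \subseteq \bigcup_{i=1}^{n} p_{X_i}^{-1}[\{\chi_i \ge 1/n\}], \ 0 \le \chi_i \le 1 \Big\} \\
&\ge \inf \Big\{ \sum_{i=1}^{n} \tfrac{1}{n} \mu_i(\{\chi_i \ge 1/n\}) :
   B \subseteq \bigcup_{i=1}^{n} p_{X_i}^{-1}[\{\chi_i \ge 1/n\}] \Big\} \ge \tfrac{1}{n} L(B).
\end{aligned} \]

From this we deduce that either $L(B) = 0$ or that there exists
$\pi \in \Pi(\mu_1, \dots, \mu_n)$ such that $\pi(B) > 0$. The last assertion of
Proposition 2.1 now follows from the following Lemma due to Richard Balka and
Marton Elekes (private communication). □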

Lemma 2.2 Suppose that $L(B) = 0$ for a Borel set
$B \subseteq X_1 \times \dots \times X_n$. Then B is an L-shaped null set.

PROOF. Fix $\varepsilon > 0$ and Borel sets $B_1^{(k)}, \dots, B_n^{(k)}$ with
$\mu_i(B_i^{(k)}) \le \varepsilon 2^{-k}$ such that for each k

\[ B \subseteq p_{X_1}^{-1}[B_1^{(k)}] \cup \dots \cup p_{X_n}^{-1}[B_n^{(k)}]. \]

Let $B_i := \bigcup_{k=1}^{\infty} B_i^{(k)}$ for $i = 2, \dots, n$ such that

\[ B \subseteq p_{X_1}^{-1}[B_1^{(k)}] \cup p_{X_2}^{-1}[B_2] \cup \dots \cup p_{X_n}^{-1}[B_n] \]

for each $k \in \mathbb{N}$. Thus with $B_1 := \bigcap_{k=1}^{\infty} B_1^{(k)}$,

\[ B \subseteq p_{X_1}^{-1}[B_1] \cup p_{X_2}^{-1}[B_2] \cup \dots \cup p_{X_n}^{-1}[B_n]. \]

Hence we can assume from now on that $\mu_1(B_1) = 0$ and that $\mu_i(B_i)$ is
arbitrarily small for $i = 2, \dots, n$. Iterating this argument in the obvious way
we get the statement. □

Remark 2.3 In the case n = 2 it was shown in [6, Proposition 3.3] that
$L(B) = P(B)$ for every Borel set $B \subseteq X_1 \times X_2$. However, for n > 2,
equality does not hold true, cf. [6, Example 3.4].

Definition 2.4 Let X, Y be Polish spaces. For a Borel measurable cost function
$c : X \times Y \to [0, \infty]$, $n \in \mathbb{N}$ and $\varepsilon > 0$ we set

\[ B_{n,\varepsilon} := \Big\{ (x_i, y_i)_{i=1}^{n} \in (X \times Y)^n :
   \sum_{i=1}^{n} c(x_i, y_i) \ge \sum_{i=1}^{n} c(x_i, y_{i+1}) + \varepsilon \Big\}. \tag{8} \]

The definition of the sets $B_{n,\varepsilon}$ is implicitly given in [3, Theorem 2.3].
The idea behind it is that $(x_i, y_i)_{i=1}^{n} \in B_{n,\varepsilon}$ tells us that
transport costs can be reduced if $x_i$ is transported to $y_{i+1}$ instead of
$y_i$ (recall the conventions $x_{n+1} = x_1$ resp. $y_{n+1} = y_1$). In what follows
we make this statement precise and give a coordinate free formulation.

Denote by $\sigma, \tau : (X \times Y)^n \to (X \times Y)^n$ the shifts defined via

\[ \sigma : (x_i, y_i)_{i=1}^{n} \mapsto (x_{i+1}, y_{i+1})_{i=1}^{n} \tag{9} \]

\[ \tau : (x_i, y_i)_{i=1}^{n} \mapsto (x_i, y_{i+1})_{i=1}^{n}. \tag{10} \]

Observe that $\sigma^n = \tau^n = \mathrm{Id}_{(X \times Y)^n}$ and that $\sigma$ and
$\tau$ commute. Also note that the set $B_{n,\varepsilon}$ from (8) is
$\sigma$-invariant (i.e. $\sigma(B_{n,\varepsilon}) = B_{n,\varepsilon}$), but in general
not $\tau$-invariant. Denote by $p_i : (X \times Y)^n \to X \times Y$ the projection
on the i-th component of the product. The projections
$p_X : X \times Y \to X$, $(x, y) \mapsto x$ and $p_Y : X \times Y \to Y$,
$(x, y) \mapsto y$ are defined as usual and there will be no danger of confusion.
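
The shifts (9) and (10) and membership in the set from (8) are straightforward to express on tuples of pairs. The sketch below is meant only to make the stated properties concrete on toy data; the function names are assumptions.

```python
def sigma(t):
    """Shift (9): (x_i, y_i) -> (x_{i+1}, y_{i+1}), indices taken modulo n."""
    n = len(t)
    return tuple(t[(i + 1) % n] for i in range(n))


def tau(t):
    """Shift (10): (x_i, y_i) -> (x_i, y_{i+1}), indices taken modulo n."""
    n = len(t)
    return tuple((t[i][0], t[(i + 1) % n][1]) for i in range(n))


def in_B(t, cost, eps):
    """Membership test for the set B_{n, eps} from (8)."""
    n = len(t)
    direct = sum(cost(x, y) for x, y in t)
    rerouted = sum(cost(t[i][0], t[(i + 1) % n][1]) for i in range(n))
    return direct >= rerouted + eps


def quadratic(x, y):
    """Squared distance, used here as a sample cost function."""
    return (x - y) ** 2


t3 = ((0, 2), (1, 1), (5, 7))  # hypothetical 3-tuple of pairs
t2 = ((0, 2), (1, 1))          # a tuple along which rerouting saves cost
```

On these examples, applying either shift n times returns the original tuple, the two shifts commute, and membership in the set from (8) survives `sigma` but not `tau`.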

Lemma 2.5 Let X, Y be Polish spaces equipped with Borel probability measures
$\mu, \nu$. Let $\pi$ be a transport plan. Then one of the following alternatives
holds:

a. $\pi$ is c-monotone,

b. there exist $n \in \mathbb{N}$, $\varepsilon > 0$ and a measure
$\kappa \in \Pi(\pi, \dots, \pi)$ such that $\kappa(B_{n,\varepsilon}) > 0$. Moreover
$\kappa$ can be taken to be both $\sigma$ and $\tau$ invariant.

PROOF. Suppose that $B_{n,\varepsilon}$ is an L-shaped null set for all
$n \in \mathbb{N}$ and every $\varepsilon > 0$. Then there are Borel sets
$S^1_{n,\varepsilon}, \dots, S^n_{n,\varepsilon} \subseteq X \times Y$ of full
$\pi$-measure such that

\[ (S^1_{n,\varepsilon} \times \dots \times S^n_{n,\varepsilon}) \cap B_{n,\varepsilon} = \emptyset \]

and $\pi$ is concentrated on the c-monotone set

\[ S = \bigcap_{k=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcap_{i=1}^{n} S^i_{n,1/k}. \]

If there exist $n \in \mathbb{N}$ and $\varepsilon > 0$ such that $B_{n,\varepsilon}$ is
not an L-shaped null set, we apply Proposition 2.1 to conclude the existence of
a measure $\kappa \in \Pi(\pi, \dots, \pi)$ with $\kappa(B_{n,\varepsilon}) > 0$. To
achieve the desired invariance, simply replace $\kappa$ by

\[ \frac{1}{n^2} \sum_{i,j=1}^{n} (\sigma^i \circ \tau^j)_\# \kappa. \qquad \square \tag{11} \]

We are now in the position to prove Theorem 1.a, i.e.

Let X, Y be Polish spaces equipped with Borel probability measures $\mu, \nu$
and let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost function. If
$\pi$ is a finite optimal transport plan, then $\pi$ is c-monotone.

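PROOF. Suppose by contradiction that $\pi$ is optimal, $I_c[\pi] < \infty$ but
$\pi$ is not c-monotone. Then by Lemma 2.5 there exist $n \in \mathbb{N}$,
$\varepsilon > 0$ and an invariant measure $\kappa \in \Pi(\pi, \dots, \pi)$ which gives
mass $\alpha > 0$ to the Borel set $B_{n,\varepsilon} \subseteq (X \times Y)^n$. Consider
now the restriction of $\kappa$ to $B_{n,\varepsilon}$ defined via
$\hat\kappa(A) := \kappa(A \cap B_{n,\varepsilon})$ for Borel sets
$A \subseteq (X \times Y)^n$. $\hat\kappa$ is $\sigma$-invariant since both the measure
$\kappa$ and the Borel set $B_{n,\varepsilon}$ are $\sigma$-invariant. Denote the
marginal of $\hat\kappa$ in the first coordinate $(X \times Y)$ of $(X \times Y)^n$ by
$\hat\pi$. Due to $\sigma$-invariance we have

\[ p_{i\#} \hat\kappa = p_{i\#}(\sigma_\# \hat\kappa) = (p_i \circ \sigma)_\# \hat\kappa = p_{i+1\#} \hat\kappa, \]

i.e. all marginals coincide and we have $\hat\kappa \in \Pi(\hat\pi, \dots, \hat\pi)$.
Furthermore, since $\hat\kappa \le \kappa$, the same is true for the marginals, i.e.
$\hat\pi \le \pi$. Denote the marginal of $\tau_\# \hat\kappa$ in the first coordinate
$(X \times Y)$ of $(X \times Y)^n$ by $\hat\pi_\beta$. As $\sigma$ and $\tau$ commute,
$\tau_\# \hat\kappa$ is $\sigma$-invariant, so the marginals in the other coordinates
coincide with $\hat\pi_\beta$. An easy calculation shows that $\hat\pi$ and
$\hat\pi_\beta$ have the same marginals in X resp. Y:

\[ p_{X\#} \hat\pi_\beta = p_{X\#}(p_{i\#}(\tau_\# \hat\kappa)) = (p_X \circ p_i \circ \tau)_\# \hat\kappa
   = (p_X \circ p_i)_\# \hat\kappa = p_{X\#} \hat\pi, \]

\[ p_{Y\#} \hat\pi_\beta = p_{Y\#}(p_{i\#}(\tau_\# \hat\kappa)) = (p_Y \circ p_i \circ \tau)_\# \hat\kappa
   = (p_Y \circ p_{i+1})_\# \hat\kappa = p_{Y\#} \hat\pi. \]

The equality of the total masses is proved similarly:

\[ \alpha = \hat\pi_\beta(X \times Y) = (p_i \circ \tau)_\# \hat\kappa(X \times Y)
   = p_{i\#} \hat\kappa(X \times Y) = \hat\pi(X \times Y). \]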



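Next we compute the transport costs associated to $\hat\pi_\beta$:

\[ \begin{aligned}
\int_{X \times Y} c \, d\hat\pi_\beta
&= \int_{(X \times Y)^n} c \circ p_1 \, d(\tau_\# \hat\kappa) && \text{(marginal property)} \\
&= \frac{1}{n} \sum_{i=1}^{n} \int_{(X \times Y)^n} c \circ p_i \, d(\tau_\# \hat\kappa) && \text{($\sigma$-invariance)} \\
&= \frac{1}{n} \sum_{i=1}^{n} \int_{(X \times Y)^n} (c \circ p_i \circ \tau) \, d\hat\kappa && \text{(push-forward)} \\
&= \frac{1}{n} \int_{B_{n,\varepsilon}} \sum_{i=1}^{n} (c \circ p_i \circ \tau) \, d\kappa && \text{(definition of $\hat\kappa$)} \\
&\le \frac{1}{n} \int_{B_{n,\varepsilon}} \Big[ \sum_{i=1}^{n} (c \circ p_i) - \varepsilon \Big] d\kappa && \text{(definition of $B_{n,\varepsilon}$)} \\
&= \int_{X \times Y} c \, d\hat\pi - \varepsilon \frac{\alpha}{n}. && \text{(definition of $\hat\pi$)}
\end{aligned} \tag{12} \]

To improve the transport plan $\pi$ we define

\[ \pi_\beta := (\pi - \hat\pi) + \hat\pi_\beta. \]

Recall that $\pi - \hat\pi$ is a positive measure, so $\pi_\beta$ is a positive
measure. As $\hat\pi$ and $\hat\pi_\beta$ have the same total mass, $\pi_\beta$ is a
probability measure. Furthermore $\hat\pi$ and $\hat\pi_\beta$ have the same
marginals, so $\pi_\beta$ is indeed a transport plan. We have

\[ I_c[\pi_\beta] = I_c[\pi] + \int_{X \times Y} c \, d(\hat\pi_\beta - \hat\pi)
   \le I_c[\pi] - \varepsilon \frac{\alpha}{n} < I_c[\pi]. \qquad \square \tag{13} \]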

J XxY n 



3 Connecting c-monotonicity and strong c-monotonicity 



The Ambrosio-Pratelli example (Example 1.2) shows that c-monotonicity need
not imply strong c-monotonicity in general. Subsequently we shall present a
condition which ensures that this implication is valid.

A c-monotone transport plan resists the attempt of enhancement by means 
of cyclically rerouting. This, however, may be due to the fact that cyclical 
rerouting is a priori impossible due to infinite transport costs on certain routes. 

Continuing Villani's interpretation, a situation where rerouting in this
consortium of bakeries and cafes is possible in a satisfactory way is as follows:
Suppose that bakery $x = x_0$ is able to produce one more croissant than it
already does and that cafe y is short of one croissant. It might not be possible
to transport the additional croissant itself to the cafe in need, as the costs
$c(x, y)$ may be infinite. Nevertheless it might be possible to find another
bakery $x_1$ (which usually supplies cafe $y_1$) such that bakery x can transport
(with finite costs!) the extra croissant to $y_1$; this leaves us with a now unused
item from bakery $x_1$, which can be transported to y with finite costs. Of course
we allow not only one, but finitely many intermediate pairs
$(x_1, y_1), \dots, (x_n, y_n)$ of bakeries/cafes to achieve this relocation of the
additional croissant.

In the Ambrosio-Pratelli example we can reroute from a point
$(x, x \oplus \alpha) \in \Gamma_1$ to a point
$(\tilde x, \tilde x \oplus \alpha) \in \Gamma_1$ only if there exists
$n \in \mathbb{N}$ such that $x \oplus (n\alpha) = \tilde x$. In particular,
irrationality of $\alpha$ implies that if we can redirect with finite costs from
$(x, x \oplus \alpha)$ to $(\tilde x, \tilde x \oplus \alpha)$ we never can redirect
back from $(\tilde x, \tilde x \oplus \alpha)$ to $(x, x \oplus \alpha)$.

Definition 3.1 Let X, Y be Polish spaces equipped with Borel probability
measures $\mu, \nu$, let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost
function and $\Gamma \subseteq X \times Y$ a Borel measurable set on which c is finite.
We define

a. $(x, y) \lesssim (\tilde x, \tilde y)$ if there exist pairs
$(x_0, y_0), \dots, (x_n, y_n) \in \Gamma$ such that $(x, y) = (x_0, y_0)$ and
$(\tilde x, \tilde y) = (x_n, y_n)$ and $c(x_1, y_0), \dots, c(x_n, y_{n-1}) < \infty$.

b. $(x, y) \approx (\tilde x, \tilde y)$ if $(x, y) \lesssim (\tilde x, \tilde y)$ and
$(x, y) \gtrsim (\tilde x, \tilde y)$.

We call $(\Gamma, c)$ connecting if c is finite on $\Gamma$ and
$(x, y) \approx (\tilde x, \tilde y)$ for all $(x, y), (\tilde x, \tilde y) \in \Gamma$.

These relations were introduced in [15, Chapter 5, p. 75] and appear in a
construction due to Stefano Bianchini.

When there is any danger of confusion we will write $\lesssim_{c,\Gamma}$ and
$\approx_{c,\Gamma}$, indicating the dependence on $\Gamma$ and c. Note that
$\lesssim$ is a pre-order, i.e. a transitive and reflexive relation, and that
$\approx$ is an equivalence relation. We will also need the projections
$\lesssim_X, \approx_X$ resp. $\lesssim_Y, \approx_Y$ of these relations onto the set
$p_X[\Gamma] \subseteq X$ resp. $p_Y[\Gamma] \subseteq Y$. The projection is defined in
the obvious way: $x \lesssim_X \tilde x$ if there exist $y, \tilde y$ such that
$(x, y), (\tilde x, \tilde y) \in \Gamma$ and $(x, y) \lesssim (\tilde x, \tilde y)$ holds.

The other relations are defined analogously. The projections of $\lesssim$ are
again pre-orders and the projections of $\approx$ are again equivalence relations,
provided c is finite on $\Gamma$. The equivalence classes of $\approx$ and its
projections are compatible in the sense that
$[(x, y)]_\approx = ([x]_{\approx_X} \times [y]_{\approx_Y}) \cap \Gamma$. The elementary
proofs of these facts are left to the reader.
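
For a finite set of pairs, the pre-order of Definition 3.1 is a reachability relation in a directed graph: a step from one pair to another is admissible when the first coordinate of the new pair can be sent to the second coordinate of the old pair at finite cost. The sketch below tests the connecting property in that finite setting. It is an illustration only; the function names and the sample cost are assumptions.

```python
import math


def reachable(pairs, cost, start):
    """Indices j such that pairs[j] can be reached from pairs[start] (part a)."""
    seen = {start}
    stack = [start]
    while stack:
        i = stack.pop()
        y_i = pairs[i][1]
        for j, (x_j, _) in enumerate(pairs):
            # a step i -> j is admissible when c(x_j, y_i) is finite
            if j not in seen and cost(x_j, y_i) < math.inf:
                seen.add(j)
                stack.append(j)
    return seen


def is_connecting(pairs, cost):
    """True if c is finite on the pairs and every pair reaches every other pair."""
    if any(cost(x, y) == math.inf for x, y in pairs):
        return False
    return all(len(reachable(pairs, cost, i)) == len(pairs)
               for i in range(len(pairs)))


def one_way_cost(x, y):
    """Hypothetical cost: finite on the diagonal and at (1, 0), infinite elsewhere."""
    if x == y:
        return 0.0
    return 1.0 if (x, y) == (1, 0) else math.inf


gamma = [(0, 0), (1, 1)]
```

With `one_way_cost` one can reroute from (0, 0) to (1, 1) but not back, a finite analogue of the one-directional behaviour described for the Ambrosio-Pratelli example, so the pair fails to be connecting.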

The main objective of this section is to prove Proposition 3.2 based on several
lemmas which will be introduced throughout the section.

Proposition 3.2 Let X, Y be Polish spaces equipped with Borel probability
measures $\mu, \nu$ and let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost
function. Let $\pi$ be a finite transport plan. Assume that there exists a
c-monotone set $\Gamma \subseteq X \times Y$ with $\pi(\Gamma) = 1$ on which c is finite,
such that $(\Gamma, c)$ is connecting. Then $\pi$ is strongly c-monotone.

In the proof of Proposition 3.2 we will establish the existence of the functions
$\varphi, \psi$ using the construction given in [12], see also [14, Chapter 2] and
[1, Theorem 3.2]. As we do not impose any continuity assumptions on the cost
function c, we cannot prove the Borel measurability of $\varphi$ and $\psi$ by using
limiting procedures similar to the methods used in [1, 12, 13, 14]. Instead we
will use the following projection theorem, a proof of which can be found in
[2, Theorem III.23] by analysts or in [5, Section 29.B] by readers who have some
interest in set theory.

Proposition 3.3⁴ Let X and Y be Polish spaces, $A \subseteq X$ a Borel measurable
set and $f : X \to Y$ a Borel measurable map. Then $B := f(A)$ is universally
measurable, i.e. B is measurable with respect to the completion of every
$\sigma$-finite Borel measure on Y.

The system of universally measurable sets is a $\sigma$-algebra. If X is a Polish
space, we call a function $f : X \to [-\infty, \infty]$ universally measurable if the
pre-image of every Borel set is universally measurable.

Lemma 3.4 Let X be a Polish space and $\mu$ a finite Borel measure on X.
If $\varphi : X \to [-\infty, \infty)$ is universally measurable, then there exists a
Borel measurable function $\tilde\varphi : X \to [-\infty, \infty)$ such that
$\tilde\varphi \le \varphi$ everywhere and $\varphi = \tilde\varphi$ almost everywhere.

PROOF. Let $(I_n)_{n=1}^{\infty}$ be an enumeration of the intervals $[a, b)$ with
endpoints in $\mathbb{Q}$ and denote the completion of $\mu$ by $\tilde\mu$. Then for
each $n \in \mathbb{N}$, $\varphi^{-1}[I_n]$ is $\tilde\mu$-measurable and hence the
union of a Borel set $B_n$ and a $\tilde\mu$-null set $N_n$. Let N be a Borel null
set which covers $\bigcup_{n=1}^{\infty} N_n$. Let
$\tilde\varphi(x) = \varphi(x) - \infty \cdot \mathbb{1}_N(x)$. Clearly
$\tilde\varphi(x) \le \varphi(x)$ for all $x \in X$ and $\tilde\varphi(x) = \varphi(x)$
for $\mu$-almost all $x \in X$. Furthermore, $\tilde\varphi$ is Borel measurable since
$(I_n)_{n=1}^{\infty}$ is a generator of the Borel $\sigma$-algebra on
$[-\infty, \infty)$ and for each $n \in \mathbb{N}$ we have that
$\tilde\varphi^{-1}[I_n] = B_n \setminus N$ is a Borel set. □



The following definition of the functions $\varphi_n$, $n \in \mathbb{N}$ resp.
$\varphi$ is reminiscent of the construction in [12].

⁴ Sets which are images of Borel sets under measurable functions are called
analytic in descriptive set theory. Lusin first noticed that analytic sets are
universally measurable. Details can be found for instance in [5].

Lemma 3.5 Let X, Y be Polish spaces, $c : X \times Y \to [0, \infty]$ a Borel measurable
cost function and $\Gamma \subseteq X \times Y$ a Borel set. Fix $(x_0, y_0) \in \Gamma$ and
assume that c is finite on $\Gamma$. For $n \in \mathbb{N}$, define
$\varphi_n : X \times \Gamma^n \to (-\infty, \infty]$ by

\[ \varphi_n(x; x_1, y_1, \dots, x_n, y_n) = [c(x, y_n) - c(x_n, y_n)]
   + \sum_{i=0}^{n-1} [c(x_{i+1}, y_i) - c(x_i, y_i)]. \tag{14} \]

Then the map $\varphi : X \to [-\infty, \infty]$ defined by

\[ \varphi(x) = \inf \{ \varphi_n(x; x_1, y_1, \dots, x_n, y_n) : n \ge 1,
   (x_i, y_i)_{i=1}^{n} \in \Gamma^n \} \tag{15} \]

is universally measurable.



PROOF. First note that the Borel $\sigma$-algebra on $[-\infty, \infty]$ is generated by
intervals of the form $[-\infty, \alpha)$, thus it is sufficient to determine the pre-images
of those sets under $\varphi$. We have

$$\varphi(x) < \alpha \iff \exists n \in \mathbb{N} \ \exists (x_1, y_1), \ldots, (x_n, y_n) \in \Gamma : \varphi_n(x; x_1, y_1, \ldots, x_n, y_n) < \alpha.$$

The set $\varphi_n^{-1}[[-\infty, \alpha)]$ is Borel measurable. Hence

$$\varphi^{-1}[[-\infty, \alpha)] = \bigcup_{n \in \mathbb{N}} p_X\big[\varphi_n^{-1}[[-\infty, \alpha)]\big]$$

is the countable union of projections of Borel sets. Since projections of Borel
sets are universally measurable by Proposition 3.3, $\varphi^{-1}[[-\infty, \alpha)]$ belongs also
to the $\sigma$-algebra of universally measurable sets. □

Lemma 3.6 Let $X$, $Y$ be Polish spaces and $c : X \times Y \to [0, \infty]$ a Borel
measurable cost function. Suppose $\Gamma$ is c-monotone, $c$ is finite on $\Gamma$ and $(\Gamma, c)$
is connecting. Fix $(x_0, y_0) \in \Gamma$. Then the map $\varphi$ from (15) is finite on $p_X[\Gamma]$.
Furthermore

$$\varphi(x) \le \varphi(x') + c(x, y) - c(x', y) \qquad \forall x \in X, \ (x', y) \in \Gamma. \tag{16}$$



PROOF. Fix $x \in p_X[\Gamma]$. Since $x_0 \lesssim x$ (recall Definition 3.1) we can find
$x_1, y_1, \ldots, x_n, y_n$ such that $\varphi_n(x; x_1, y_1, \ldots, x_n, y_n) < \infty$. Hence $\varphi(x) < \infty$.
Proving $\varphi(x) > -\infty$ involves some wrestling with notation but, not very
surprisingly, it comes down to applying the fact that $x \lesssim x_0$. Let $a_1 = x$
and choose $b_1, a_2, b_2, \ldots, a_m, b_m$ such that $(a_1, b_1), \ldots, (a_m, b_m) \in \Gamma$ and
$c(a_2, b_1), \ldots, c(a_m, b_{m-1}), c(x_0, b_m) < \infty$. Assume now that $x_1, y_1, \ldots, x_n, y_n$ are
given such that $\varphi_n(x; x_1, y_1, \ldots, x_n, y_n) < \infty$. Put $x_{n+i} = a_i$ and $y_{n+i} = b_i$ for
$i \in \{1, \ldots, m\}$. Due to c-monotonicity of $\Gamma$ and the finiteness of all involved
terms we have:

$$0 \le [c(x_0, y_{n+m}) - c(x_{n+m}, y_{n+m})] + \sum_{i=0}^{n+m-1} [c(x_{i+1}, y_i) - c(x_i, y_i)]$$

which, after regrouping (the summand with $i = n$ equals $c(x, y_n) - c(x_n, y_n)$, and
the summands with $i > n$ involve only the points $a_j$, $b_j$), yields

$$\alpha := [c(a_m, b_m) - c(x_0, b_m)] + \sum_{i=1}^{m-1} [c(a_i, b_i) - c(a_{i+1}, b_i)]
\le [c(x, y_n) - c(x_n, y_n)] + \sum_{i=0}^{n-1} [c(x_{i+1}, y_i) - c(x_i, y_i)]. \tag{17}$$

Note that the right hand side of (17) is just $\varphi_n(x; x_1, y_1, \ldots, x_n, y_n)$, while $\alpha$ is
finite and does not depend on $x_1, y_1, \ldots, x_n, y_n$. Thus
passing to the infimum we see that $\varphi(x) \ge \alpha > -\infty$. To prove the remaining
inequality, observe that the right hand side of (16) can be written as

$$\inf \{ \varphi_n(x; x_1, y_1, \ldots, x_n, y_n) : n \ge 1, \ (x_i, y_i)_{i=1}^{n} \in \Gamma^n \text{ and } (x_n, y_n) = (x', y) \}$$

whereas the left hand side of (16) is the same, without the restriction $(x_n, y_n) =
(x', y)$. □
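To make the construction (14)-(15) concrete, here is a minimal sketch for a finite set $\Gamma$; it is an illustration only and not part of the paper's argument. It assumes that $\Gamma$ is finite and c-monotone with $c$ finite on $\Gamma$, so that the infimum over chains becomes a shortest-path problem without negative cycles, which is solved here by Bellman-Ford relaxation (a technique swapped in for the illustration). The names `potential`, `cost` and `gamma` are hypothetical.

```python
import math

def potential(c, gamma, xs):
    """Evaluate phi from (15) at the points xs for a finite c-monotone set gamma.

    gamma[0] plays the role of the base point (x_0, y_0); c must be finite on gamma.
    """
    m = len(gamma)

    def step(i, j):
        # Summand c(x_{i+1}, y_i) - c(x_i, y_i) of (14) when gamma[j] follows gamma[i].
        return c(gamma[j][0], gamma[i][1]) - c(gamma[i][0], gamma[i][1])

    # dist[j]: smallest value of the sum in (14) over chains from gamma[0] to gamma[j].
    dist = [math.inf] * m
    dist[0] = 0.0
    for _ in range(m - 1):  # Bellman-Ford; c-monotonicity excludes negative cycles
        for i in range(m):
            for j in range(m):
                if dist[i] + step(i, j) < dist[j]:
                    dist[j] = dist[i] + step(i, j)

    # Close each chain with the bracket [c(x, y_n) - c(x_n, y_n)] from (14).
    return {
        x: min(dist[j] + c(x, gamma[j][1]) - c(gamma[j][0], gamma[j][1])
               for j in range(m))
        for x in xs
    }

# Hypothetical data: quadratic cost, gamma on the diagonal (which is c-monotone).
cost = lambda x, y: (x - y) ** 2
gamma = [(0, 0), (1, 1), (2, 2)]
phi = potential(cost, gamma, [0, 1, 2])
```

For this data one can check by hand that inequality (16) holds for every $x \in \{0, 1, 2\}$ and every $(x', y) \in \Gamma$, in line with Lemma 3.6.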

Lemma 3.7 Let $X$, $Y$ be Polish spaces and $c : X \times Y \to [0, \infty]$ a Borel
measurable cost function. Let $X_0 \subseteq X$ be a non-empty Borel set and let $\varphi :
X_0 \to \mathbb{R}$ be a Borel measurable function. Then the c-transform $\psi : Y \to
[-\infty, \infty)$, defined as

$$\psi(y) := \inf_{x \in X_0} [c(x, y) - \varphi(x)] \tag{18}$$

is universally measurable.

PROOF. As in the proof of Lemma 3.4 we consider the set $\psi^{-1}[[-\infty, \alpha)]$:

$$\psi(y) < \alpha \iff \exists x \in X_0 : c(x, y) - \varphi(x) < \alpha.$$

Note that the set $\{(x, y) \in X_0 \times Y : c(x, y) - \varphi(x) < \alpha\}$ is Borel. Thus

$$\psi^{-1}[[-\infty, \alpha)] = p_Y\big[\{(x, y) \in X_0 \times Y : c(x, y) - \varphi(x) < \alpha\}\big]$$

is the projection of a Borel set, hence universally measurable. □
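As a small illustration of the c-transform (18), the following sketch evaluates $\psi$ when $X_0$ is a finite set, so that the infimum is a minimum. This is an assumption made for the example only; the function name `c_transform` and the sample data are hypothetical and not taken from the paper.

```python
def c_transform(c, phi, ys):
    """psi(y) = min over x in the finite domain of phi of c(x, y) - phi(x), cf. (18)."""
    return {y: min(c(x, y) - phi[x] for x in phi) for y in ys}

# Hypothetical data: quadratic cost and a real-valued phi on X_0 = {0, 1, 2}.
cost = lambda x, y: (x - y) ** 2
phi = {0: 0, 1: 1, 2: 2}
psi = c_transform(cost, phi, [0, 1, 2])
```

By construction $\varphi(x) + \psi(y) \le c(x, y)$ for all $x \in X_0$ and all $y$, which is the inequality required of a pair of dual functions.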

We are now able to prove the main result of this section. 

PROOF of Proposition 3.2. Let $\Gamma \subseteq X \times Y$ be a c-monotone Borel set
such that $\pi(\Gamma) = 1$ and the pair $(\Gamma, c)$ is connecting. Let $\varphi$ be the map from
Lemma 3.5. Using Lemma 3.4 and Lemma 3.6, and passing if necessary to a
subset of full $\pi$-measure, we may assume that $\varphi$ is Borel measurable, that
$X_0 := p_X[\Gamma]$ is a Borel set and that

$$c(x', y) - \varphi(x') \le c(x, y) - \varphi(x) \qquad \forall x \in X_0, \ (x', y) \in \Gamma. \tag{19}$$

Note that (19) follows from (16) in Lemma 3.6. Here we consider $x \in X_0$ in
order to ensure that $\varphi(x)$ is finite on $X_0$. Now consider the c-transform

$$\psi(y) := \inf_{x \in X_0} [c(x, y) - \varphi(x)], \tag{20}$$

which by Lemma 3.7 is universally measurable. Fix $y \in p_Y[\Gamma]$. Using (19) we
see that the infimum in (20) is attained at a point $x_0 \in X_0$ satisfying $(x_0, y) \in
\Gamma$. This implies that $\varphi(x) + \psi(y) = c(x, y)$ on $\Gamma$ and $\varphi(x) + \psi(y) \le c(x, y)$ on
$p_X[\Gamma] \times p_Y[\Gamma]$. To guarantee this inequality on the whole product $X \times Y$, one
has to redefine $\varphi$ and $\psi$ to be $-\infty$ on the complement of $p_X[\Gamma]$ resp. $p_Y[\Gamma]$.
Applying Lemma 3.4 once more, we find that there exists a Borel set $N \subseteq Y$
of zero $\nu$-measure, such that $\tilde\psi(y) = \psi(y) - \infty \cdot 1_N(y)$ is Borel measurable.
Finally, replace $\Gamma$ by $\Gamma \cap (X \times (Y \setminus N))$ and $\psi$ by $\tilde\psi$. □
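The step in the preceding proof where the infimum in (20) is attained can be spelled out as follows. This is only a sketch that uses nothing beyond (19) and (20); here $y \in p_Y[\Gamma]$ is fixed and $x_0$ denotes any point with $(x_0, y) \in \Gamma$.

```latex
% Fix y in p_Y[\Gamma] and pick x_0 with (x_0, y) \in \Gamma; then x_0 \in X_0 = p_X[\Gamma].
\begin{align*}
\text{(19) with } (x', y) = (x_0, y): \quad
  & c(x_0, y) - \varphi(x_0) \le c(x, y) - \varphi(x) \quad \text{for all } x \in X_0, \\
\text{hence by (20):} \quad
  & \psi(y) = \inf_{x \in X_0} [c(x, y) - \varphi(x)] = c(x_0, y) - \varphi(x_0), \\
\text{i.e.} \quad
  & \varphi(x_0) + \psi(y) = c(x_0, y) \quad \text{whenever } (x_0, y) \in \Gamma .
\end{align*}
```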

We conclude this section by proving that every strongly c-monotone transport
plan is optimal (Proposition 1.5).

Let $X$, $Y$ be Polish spaces equipped with Borel probability measures $\mu$, $\nu$ and
let $c : X \times Y \to [0, \infty]$ be Borel measurable. Then every finite transport plan
which is strongly c-monotone is optimal.

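PROOF. Let $\pi_0$ be a strongly c-monotone transport plan. Then, according
to the definition, there exist Borel functions $\varphi(x)$ and $\psi(y)$ taking values in
$[-\infty, \infty)$ such that

$$\varphi(x) + \psi(y) \le c(x, y) \tag{21}$$

everywhere on $X \times Y$ and equality holds $\pi_0$-a.e. We define the truncations
$\varphi_n = (n \wedge (\varphi \vee -n))$, $\psi_n = (n \wedge (\psi \vee -n))$ and let $\xi_n(x, y) := \varphi_n(x) + \psi_n(y)$
resp. $\xi(x, y) := \varphi(x) + \psi(y)$. Note that $\varphi_n$, $\psi_n$, $\xi_n$, $\xi$ are Borel measurable. By
elementary considerations which are left to the reader, we get pointwise monotone
convergence $\xi_n \uparrow \xi$ on the set $\{\xi \ge 0\}$ resp. $\xi_n \downarrow \xi$ on the set $\{\xi \le 0\}$. Let
$\pi_1$ be an arbitrary finite transport plan; to compare $I_c[\pi_0]$ and $I_c[\pi_1]$ we make
the following observations:

a. By monotone convergence

$$\int 1_{\{\xi \ge 0\}} \, \xi_n \, d\pi_i \uparrow \int 1_{\{\xi \ge 0\}} \, \xi \, d\pi_i \le I_c[\pi_i] < \infty \quad \text{and} \tag{22}$$

$$\int 1_{\{\xi < 0\}} \, \xi_n \, d\pi_i \downarrow \int 1_{\{\xi < 0\}} \, \xi \, d\pi_i \tag{23}$$

for $i \in \{0, 1\}$, hence $\lim_{n \to \infty} \int \xi_n \, d\pi_i = \int \xi \, d\pi_i$.

b. By the assumption on equal marginals of $\pi_0$ and $\pi_1$ we obtain for $n \ge 0$

$$\int \xi_n \, d\pi_0 = \int \varphi_n \, d\mu + \int \psi_n \, d\nu \tag{24}$$

$$\phantom{\int \xi_n \, d\pi_0} = \int \xi_n \, d\pi_1. \tag{25}$$

Thus $I_c[\pi_0] = \int \xi \, d\pi_0 = \lim_{n \to \infty} \int \xi_n \, d\pi_0 = \lim_{n \to \infty} \int \xi_n \, d\pi_1 = \int \xi \, d\pi_1 \le I_c[\pi_1]$;
since $\pi_1$ was arbitrary, this implies optimality of $\pi_0$. □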
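The monotone convergence of the truncated sums, which the proof leaves to the reader, can be verified by a case distinction. The following is one possible way to organise it, offered as a sketch rather than as the authors' argument; the symbols $a$ and $b$ are introduced only for this computation.

```latex
% Notation: \varphi_n = n \wedge (\varphi \vee -n), \psi_n = n \wedge (\psi \vee -n),
%           \xi_n = \varphi_n + \psi_n, \xi = \varphi + \psi, with \varphi, \psi \in [-\infty, \infty).
\textbf{Case 1:} $\varphi, \psi \ge 0$ (so $\xi \ge 0$). Then $\varphi_n = \varphi \wedge n$ and
$\psi_n = \psi \wedge n$ are nondecreasing in $n$, so $\xi_n \uparrow \xi$.

\textbf{Case 2:} $\varphi, \psi \le 0$ (so $\xi \le 0$). Then $\varphi_n = \varphi \vee (-n)$ and
$\psi_n = \psi \vee (-n)$ are nonincreasing in $n$, so $\xi_n \downarrow \xi$.

\textbf{Case 3:} mixed signs, say $\varphi = a > 0$ and $\psi = -b$ with $b \in (0, \infty]$
(the other mixed case is symmetric). Then
\[
\xi_n = (a \wedge n) - (b \wedge n) =
\begin{cases}
0, & n \le a \wedge b, \\
n - b, & b \le n \le a \quad (\text{occurs only if } b \le a), \\
a - n, & a \le n \le b \quad (\text{occurs only if } a \le b), \\
a - b, & n \ge a \vee b .
\end{cases}
\]
If $\xi = a - b \ge 0$, the sequence runs through $0$, then $n - b$, then $a - b$, hence
$0 \le \xi_n \uparrow \xi$. If $\xi = a - b \le 0$, it runs through $0$, then $a - n$, then $a - b$
(or decreases to $-\infty$ if $b = \infty$), hence $0 \ge \xi_n \downarrow \xi$.
```

In particular $\xi_n \ge 0$ on $\{\xi \ge 0\}$ and $\xi_n \le 0$ on $\{\xi \le 0\}$, which is what the monotone convergence theorem needs in step a.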



4 From c-monotonicity to optimality

This section is devoted to the proof of Theorem 1.b. Our argument starts with
a finite c-monotone transport plan $\pi$ and we aim to show that $\pi$ is at least
as good as any other finite transport plan. The idea behind the proof is to
partition $X$ and $Y$ into cells $C_i$, $i \in I$ resp. $D_i$, $i \in I$ in such a way that $\pi$
is strongly c-monotone on "diagonal" sets of the form $C_i \times D_i$ while regions
$C_i \times D_j$, $i \ne j$ can be ignored, because no finite transport plan will give positive
measure to the set $C_i \times D_j$.

Thus it will be necessary to apply previously established results to some
restricted transport problems on a space $C_i \times D_i$ equipped with some relativized
transport plan $\pi \restriction C_i \times D_i$. As in general the cells $C_i$, $D_i$ are merely Borel sets,
they may fail to be Polish spaces with respect to the topologies inherited from
$X$ resp. $Y$. However, for us it is only important that there exist some Polish
topologies that generate the same Borel sets on $C_i$ resp. $D_i$ (see e.g. [5,
Theorem 13.1]). At this point it is crucial that our results only need measurability
of the cost function and do not ask for any form of continuity (cf. the remarks
at the end of the introduction). Before we give the proof of Theorem 1.b we
will need some preliminary lemmas.

Lemma 4.1 Let $X$, $Y$ be Polish spaces equipped with Borel probability
measures $\mu$, $\nu$ and let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost function. Let
$\pi$, $\pi_0$ be finite transport plans and $\Gamma \subseteq X \times Y$ a Borel set with $\pi(\Gamma) = 1$ on
which $c$ is finite. Let $I = \{0, \ldots, n\}$ or $I = \mathbb{N}$ and assume that $C_i$, $i \in I$ are
mutually disjoint Borel sets in $X$, $D_i$, $i \in I$ are mutually disjoint Borel sets
in $Y$ such that the equivalence classes of $\approx_{\Gamma, c}$ are of the form $\Gamma \cap (C_i \times D_i)$.
Then also $\pi_0(\bigcup_{i \in I} C_i \times D_i) = 1$.

In the proof we will need the following simple lemma. (For a proof see for
instance [4, Proposition 8.13].)

Lemma 4.2 Let $I = \{0, \ldots, n\}$ or $I = \mathbb{N}$ and let $P = (p_{ij})_{i,j \in I}$ be a matrix
with non-negative entries such that $\sum_{j \in I} p_{i_0 j} = 1$ for each $i_0 \in I$. Assume that
there exists a vector $p = (p_i)_{i \in I}$ with strictly positive entries such that $p \cdot P = p$.⁵ Then
whenever $p_{i_0 i_1} > 0$ for $i_0, i_1 \in I$, there exists a finite sequence $i_0, i_1, \ldots, i_n = i_0$
such that for all $0 \le k < n$ one has $p_{i_k i_{k+1}} > 0$.
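For a finite index set, the conclusion of Lemma 4.2 can be illustrated by a breadth-first search along the strictly positive entries of $P$. The sketch below is only an illustration under that finiteness assumption and is not a proof of the lemma; the function name `return_path` and the two sample matrices are hypothetical.

```python
from collections import deque

def return_path(P, i0, i1):
    """Search for indices i0, i1, ..., i0 such that consecutive entries of P are > 0.

    P is a finite square matrix given as a list of rows. Returns the index
    sequence as a list, or None if i0 cannot be reached again from i1.
    """
    if not P[i0][i1] > 0:
        raise ValueError("the entry P[i0][i1] must be strictly positive")
    parent = {i1: None}          # breadth-first search tree rooted at i1
    queue = deque([i1])
    while queue:
        i = queue.popleft()
        if i == i0:
            break
        for j in range(len(P)):
            if P[i][j] > 0 and j not in parent:
                parent[j] = i
                queue.append(j)
    if i0 not in parent:
        return None
    path, node = [], i0
    while node is not None:      # walk back from i0 to i1 along the tree
        path.append(node)
        node = parent[node]
    path.reverse()               # now i1, ..., i0
    return [i0] + path

# A cyclic permutation matrix: the uniform vector p satisfies p.P = p.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
# An absorbing state: p.P = p forces p_0 = 0, so the hypothesis of Lemma 4.2 fails.
absorbing = [[0, 1], [0, 1]]
```

In the second matrix no return path exists, which shows that the assumption of a strictly positive invariant vector cannot simply be dropped.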



PROOF of Lemma 4.1. As $\approx_{\Gamma, c}$ is an equivalence relation and $\pi$ is
concentrated on $\Gamma$, the sets $C_i$, $i \in I$ are a partition of $X$ modulo $\mu$-null sets. Likewise
the sets $D_i$, $i \in I$ form a partition of $Y$ modulo $\nu$-null sets. In particular the
quantities

$$p_i := \mu(C_i) = \nu(D_i) = \pi(C_i \times D_i), \qquad i \in I \tag{26}$$

add up to 1. Without loss of generality we may assume that $p_i > 0$ for all
$i \in I$. We define

$$p_{ij} := \frac{\pi_0(C_i \times D_j)}{\mu(C_i)}, \qquad i, j \in I. \tag{27}$$

Then $\sum_{j \in I} p_{ij} = \frac{\pi_0(C_i \times Y)}{\mu(C_i)} = 1$ for each $i \in I$. By the condition on the
marginals of $\pi_0$ we have for the $i$-th component of $p \cdot P$

$$(p \cdot P)_i = \sum_{j \in I} \mu(C_j) \, \frac{\pi_0(C_j \times D_i)}{\mu(C_j)} = \pi_0(X \times D_i) = \nu(D_i) = p_i,$$

i.e. $p \cdot P = p$. Hence $P$ satisfies the assumptions of Lemma 4.2. We claim that
$p_{ii} = 1$ for all $i \in I$. Suppose not. Pick $i_0 \in I$ such that $p_{i_0 i_0} < 1$. Then
there exists some index $i_1 \ne i_0$ such that $p_{i_0 i_1} > 0$. Pick a finite sequence
$i_0, i_1, \ldots, i_n = i_0$ according to Lemma 4.2. Fix $k \in \{0, \ldots, n - 1\}$. Then

$$\pi_0(C_{i_k} \times D_{i_{k+1}}) = p_{i_k i_{k+1}} \, \mu(C_{i_k}) > 0.$$

Since $\pi_0$ is a finite transport plan, there exist $x_k \in C_{i_k} \cap p_X[\Gamma]$ and $y'_{k+1} \in
D_{i_{k+1}} \cap p_Y[\Gamma]$ such that $c(x_k, y'_{k+1}) < \infty$. Choose $y_k \in D_{i_k}$ and $x'_{k+1} \in C_{i_{k+1}}$
such that $(x_k, y_k), (x'_{k+1}, y'_{k+1}) \in \Gamma$. Then

$$(x_0, y_0) \gtrsim (x'_1, y'_1) \approx (x_1, y_1) \gtrsim (x'_2, y'_2) \approx (x_2, y_2) \gtrsim \ldots \gtrsim (x'_n, y'_n) \approx (x_0, y_0).$$

But this implies that $(x_0, y_0) \approx (x_1, y_1)$, contradicting the assumption that
$(C_{i_0} \times D_{i_0}) \cap \Gamma$, $(C_{i_1} \times D_{i_1}) \cap \Gamma$ are different equivalence classes of $\approx_{\Gamma, c}$. Hence
we have indeed $p_{ii} = 1$ for all $i \in I$, thus $\pi_0(C_i \times D_i) = \mu(C_i)$ which implies
$\pi_0(\bigcup_{i \in I} C_i \times D_i) = 1$. □

Lemma 4.3 Let $X$, $Y$ be Polish spaces equipped with Borel probability
measures $\mu$, $\nu$ and let $c : X \times Y \to [0, \infty]$ be a Borel measurable cost function
which is $\mu \otimes \nu$-a.e. finite. For every finite transport plan $\pi$ and every Borel
set $\Gamma \subseteq X \times Y$ with $\pi(\Gamma) = 1$ on which $c$ is finite, there exist Borel sets
$O \subseteq X$, $U \subseteq Y$ such that $\Gamma' = \Gamma \cap (O \times U)$ has full $\pi$-measure and $(\Gamma', c)$ is
connecting.



5 Such a matrix $P$ is often called a stochastic matrix while $p$ is a stochastic vector.

PROOF. By Fubini's Theorem for $\mu$-almost all $x \in X$ the set $\{y : c(x, y) <
\infty\}$ has full $\nu$-measure and for $\nu$-almost all $y \in Y$ the set $\{x : c(x, y) < \infty\}$
has full $\mu$-measure. In particular the set of points $(x_0, y_0)$ such that both
$\mu(\{x : c(x, y_0) < \infty\}) = 1$ and $\nu(\{y : c(x_0, y) < \infty\}) = 1$ has full $\pi$-measure.
Fix such a pair $(x_0, y_0) \in \Gamma$ and let $O = \{x \in X : c(x, y_0) < \infty\}$, $U = \{y \in
Y : c(x_0, y) < \infty\}$. Then $\Gamma' = \Gamma \cap (O \times U)$ has full $\pi$-measure and for every
$(x, y) \in \Gamma'$ both quantities $c(x, y_0)$ and $c(x_0, y)$ are finite. Hence $x \approx_X x_0$ for
every $x \in p_X[\Gamma']$. Similarly we obtain $y \approx_Y y_0$ for every $y \in p_Y[\Gamma']$. Hence
$(\Gamma', c)$ is connecting. □

Finally we prove the statement of Theorem 1.b:

Let $X$, $Y$ be Polish spaces equipped with Borel probability measures $\mu$, $\nu$ and
$c : X \times Y \to [0, \infty]$ a Borel measurable cost function. Every finite c-monotone
transport plan is optimal if there exist a closed set $F$ and a $\mu \otimes \nu$-null set $N$
such that $\{(x, y) : c(x, y) = \infty\} = F \cup N$.

PROOF. Let $\pi$ be a finite c-monotone transport plan and pick a c-monotone
Borel set $\Gamma \subseteq X \times Y$ with $\pi(\Gamma) = 1$ on which $c$ is finite.

Let $O_n$, $U_n$, $n \in \mathbb{N}$ be open sets such that $\bigcup_{n \in \mathbb{N}} (O_n \times U_n) = (X \times Y) \setminus F$.
Fix $n \in \mathbb{N}$ and interpret $\pi \restriction O_n \times U_n$ as a transport plan on the spaces
$(O_n, \mu_n)$ and $(U_n, \nu_n)$ where $\mu_n$ and $\nu_n$ are the marginals corresponding to
$\pi \restriction O_n \times U_n$. Apply Lemma 4.3 to $\Gamma \cap (O_n \times U_n)$ and the cost function
$c \restriction O_n \times U_n$ to find $O'_n \subseteq O_n$, $U'_n \subseteq U_n$ and $\Gamma_n = \Gamma \cap (O'_n \times U'_n)$ with
$\pi(\Gamma_n) = \pi(\Gamma \cap (O_n \times U_n))$ such that $(\Gamma_n, c)$ is connecting. Then $\tilde\Gamma = \bigcup_{n \in \mathbb{N}} \Gamma_n$ is
a subset of $\Gamma$ of full measure and every equivalence class of $\approx_{\tilde\Gamma, c}$ can be written
in the form $\big((\bigcup_{n \in \mathcal{N}} O'_n) \times (\bigcup_{n \in \mathcal{N}} U'_n)\big) \cap \tilde\Gamma$ for some non-empty index set $\mathcal{N} \subseteq \mathbb{N}$.
Thus there are at most countably many equivalence classes which we can write
in the form $(C_i \times D_i) \cap \tilde\Gamma$, $i \in I$ where $I = \{1, \ldots, n\}$ or $I = \mathbb{N}$. Note that by
shrinking the sets $C_i$, $D_i$, $i \in I$ we can assume that $C_i \cap C_j = D_i \cap D_j = \emptyset$ for
$i \ne j$.

Assume now that we are given another finite transport plan $\pi_0$. Apply Lemma
4.1 to $\pi$, $\pi_0$ and $\tilde\Gamma$ to achieve that $\pi_0$ is concentrated on $\bigcup_{i \in I} C_i \times D_i$. For $i \in I$
we consider the restricted problem of transporting $\mu \restriction C_i$ to $\nu \restriction D_i$. We know
that $\pi \restriction C_i \times D_i$ is optimal for this task by Propositions 1.5 and 3.2, hence
$I_c[\pi] \le I_c[\pi_0]$. □

Remark 4.4 In fact the following somewhat more general (but also more
complicated to state) result holds true: Assume that $\{(x, y) : c(x, y) = \infty\} \subseteq
F \cup N$ where $F$ is closed and $N$ is a $\mu \otimes \nu$-null set. Then every c-monotone
transport plan $\pi$ with $\pi(F \cup N) = 0$ is optimal.

5 Completing the picture



First we give the proof of Theorem 2.

Let $X$, $Y$ be Polish spaces equipped with Borel probability measures $\mu$, $\nu$ and
$c : X \times Y \to [0, \infty]$ a Borel measurable cost function. For a finite transport
plan $\pi$ the following assertions are equivalent:

a. $\pi$ is robustly optimal.

b. $\pi$ is strongly c-monotone.



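PROOF. a. $\Rightarrow$ b.: Let $Z$ and $\lambda \ne 0$ be according to the definition of robust
optimality. As $\tilde\pi = (\mathrm{Id}_Z \times \mathrm{Id}_Z)_{\#} \lambda + \pi$ is optimal, Theorem 1.a ensures the
existence of a $\tilde c$-monotone Borel set $\tilde\Gamma \subseteq (X \cup Z) \times (Y \cup Z)$ such that $\tilde c$ is finite
on $\tilde\Gamma$ and $\tilde\pi$ is concentrated on $\tilde\Gamma$. Note that $(z, z) \in \tilde\Gamma$ for $\lambda$-a.e. $z \in Z$. We
claim that for $\lambda$-a.e. $z \in Z$ and all $(x, y) \in \Gamma = \tilde\Gamma \cap (X \times Y)$ the relation

$$(x, y) \approx_{\tilde\Gamma, \tilde c} (z, z) \tag{28}$$

holds true. Indeed, since $\tilde c$ is finite on $Z \times Y$ we have $\tilde c(z, y) < \infty$, hence
$(x, y) \lesssim_{\tilde\Gamma, \tilde c} (z, z)$. Analogously, finiteness of $\tilde c$ on $X \times Z$ implies $\tilde c(x, z) < \infty$
such that also $(z, z) \lesssim_{\tilde\Gamma, \tilde c} (x, y)$. By transitivity of $\approx_{\tilde\Gamma, \tilde c}$, $(\tilde\Gamma, \tilde c)$ is connecting.
Applying Proposition 3.2 to the spaces $X \cup Z$ and $Y \cup Z$ we get that $\tilde\pi$ is
strongly $\tilde c$-monotone, i.e. there exist $\tilde\varphi$ and $\tilde\psi$ such that $\tilde\varphi(a) + \tilde\psi(b) \le \tilde c(a, b)$
for $(a, b) \in (X \cup Z) \times (Y \cup Z)$ and equality holds $\tilde\pi$-almost everywhere. By
restricting $\tilde\varphi$ and $\tilde\psi$ to $X$ resp. $Y$ we see that $\pi$ is strongly c-monotone.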



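b. $\Rightarrow$ a.: Let $Z$ be a Polish space and let $\lambda$ be a finite Borel measure on $Z$. We
extend $c$ to $\tilde c : (X \cup Z) \times (Y \cup Z) \to [0, \infty]$ via

$$\tilde c(a, b) := \begin{cases}
c(a, b) & \text{for } (a, b) \in X \times Y \\
\max(\varphi(a), 0) & \text{for } (a, b) \in X \times Z \\
\max(\psi(b), 0) & \text{for } (a, b) \in Z \times Y \\
0 & \text{otherwise.}
\end{cases}$$

Define

$$\tilde\varphi(a) := \begin{cases} \varphi(a) & \text{for } a \in X \\ 0 & \text{for } a \in Z \end{cases}
\qquad \text{and} \qquad
\tilde\psi(b) := \begin{cases} \psi(b) & \text{for } b \in Y \\ 0 & \text{for } b \in Z. \end{cases}$$

Then $\tilde\varphi$ resp. $\tilde\psi$ are extensions of $\varphi$ resp. $\psi$ to $X \cup Z$ resp. $Y \cup Z$ which satisfy
$\tilde\varphi(a) + \tilde\psi(b) \le \tilde c(a, b)$ and equality holds on $\tilde\Gamma = \Gamma \cup \{(z, z) : z \in Z\}$. Hence $\tilde\Gamma$ is
strongly $\tilde c$-monotone. Since $\tilde\pi$ is concentrated on $\tilde\Gamma$, $\tilde\pi$ is optimal by Proposition
1.5. □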



Next consider Theorem 3.

Let $X$, $Y$ be Polish spaces equipped with Borel probability measures $\mu$, $\nu$ and
let $c : X \times Y \to [0, \infty]$ be Borel measurable and $\mu \otimes \nu$-a.e. finite. For a finite
transport plan $\pi$ the following assertions are equivalent:

(1) $\pi$ is optimal.

(2) $\pi$ is c-monotone.

(3) $\pi$ is robustly optimal.

(4) $\pi$ is strongly c-monotone.

PROOF. By Theorem 2, (3) and (4) are equivalent and they trivially imply
(1) and (2) which are equivalent by Theorem 1. It remains to see that (2) $\Rightarrow$
(4). Let $\pi$ be a finite c-monotone transport plan. Pick a c-monotone Borel set
$\Gamma \subseteq X \times Y$ such that $c$ is finite on $\Gamma$ and $\pi(\Gamma) = 1$. By Lemma 4.3 there
exists a Borel set $\Gamma' \subseteq \Gamma$ such that $\pi(\Gamma') = 1$ and $(\Gamma', c)$ is connecting, hence
Proposition 3.2 applies. □

Finally the example below shows that the ($\mu \otimes \nu$-a.e.) finiteness of the cost
function is essential to be able to pass from the "weak properties" (optimality,
c-monotonicity) to the "strong properties" (robust optimality, strong
c-monotonicity).

Example 5.1 (Optimality does not imply strong c-monotonicity) Let
$X = Y = [0, 1]$ and equip both spaces with Lebesgue measure $\lambda = \mu = \nu$. Define
$c$ to be $\infty$ above the diagonal and $1 - \sqrt{x - y}$ for $y \le x$. The optimal (in this
case the only finite) transport plan is the Lebesgue measure $\pi$ on the diagonal
$\Delta$. We claim that $\pi$ is not strongly c-monotone. Striving for a contradiction
we assume that there exist $\varphi$ and $\psi$ witnessing the strong c-monotonicity. Let
$\Delta_1$ be the full-measure subset of $\Delta$ on which $\varphi + \psi = c$, and write $p_X[\Delta_1]$ for
the projection of $\Delta_1$. We claim that

$$\forall x, x' \in p_X[\Delta_1] : \text{ If } x < x', \text{ then } \varphi(x) - \varphi(x') \ge \sqrt{x' - x}, \tag{29}$$

which will yield a contradiction when combined with the fact that $p_X[\Delta_1]$ is
dense.

Our claim (29) follows directly from

$$\varphi(x') + \psi(x) \le c(x', x) = 1 - \sqrt{x' - x} \quad \text{and} \quad \varphi(x) + \psi(x) = c(x, x) = 1. \tag{30}$$

Now let $x < x + a$ be elements of $p_X[\Delta_1]$, let $b := \varphi(x) - \varphi(x + a)$, and let $n \in \mathbb{N}$ be
a sufficiently large number, say satisfying $n > \frac{2b^2}{a}$. Note that $b \ge \sqrt{a} > 0$ by (29). Using the fact that $p_X[\Delta_1]$
is dense, we can find real numbers $x = x_0 < x_1 < \cdots < x_n = x + a$ in $p_X[\Delta_1]$
satisfying $x_k - x_{k-1} < \frac{2a}{n}$ for $k = 1, \ldots, n$.

Let $\varepsilon_k := x_k - x_{k-1}$ for $k = 1, \ldots, n$. Then we have $\varepsilon_k < \frac{2a}{n} < \frac{a^2}{b^2}$ for all $k$,
hence $\sqrt{\varepsilon_k} > \frac{b}{a} \varepsilon_k$. So we get

$$b = \sum_{k=1}^{n} \big(\varphi(x_{k-1}) - \varphi(x_k)\big) \ge \sum_{k=1}^{n} \sqrt{\varepsilon_k} > \sum_{k=1}^{n} \frac{b}{a} \, \varepsilon_k = \frac{b}{a} \sum_{k=1}^{n} \varepsilon_k = b,$$

a contradiction. (By letting $c = 0$ below the diagonal the argument could be
simplified, but then we would lose lower semi-continuity of $c$.)